Implement OpenTelemetry integration - client/server and explicit token latency #296
Conversation
Force-pushed from 9bf7090 to d15ac8c
// bag := baggage.FromContext(ctx)
// span.AddEvent("handling this...", trace.WithAttributes(uk.String(bag.Member("username").Value())))

_, _ = io.WriteString(w, "Hello, world!\n")
Hi @ppalucki, what's the purpose of this response?
@zhlsunshine I accidentally copied it from a working branch of streaming and later rebased - it is now removed.
// span.AddEvent("handling this...", trace.WithAttributes(uk.String(bag.Member("username").Value())))

_, _ = io.WriteString(w, "Hello, world!\n")
ctx, cancel := context.WithTimeout(req.Context(), 3600*time.Minute)
Hi @ppalucki, might a 1-hour timeout be too long?
@zhlsunshine fixed (the long timeout was used for my debugging sessions)
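For reference, a minimal sketch of a tighter, explicitly cancelled timeout; the 5-minute value and handler name below are assumptions for illustration, not necessarily what the fix uses:

```go
package handlers

import (
	"context"
	"net/http"
	"time"
)

// handleWithTimeout derives a bounded context from the incoming request
// instead of the long debugging timeout discussed above; cancel releases the
// timer as soon as the handler returns.
func handleWithTimeout(w http.ResponseWriter, req *http.Request) {
	ctx, cancel := context.WithTimeout(req.Context(), 5*time.Minute) // assumed value
	defer cancel()

	_ = ctx // pass ctx to the downstream pipeline call so it stops on timeout
	w.WriteHeader(http.StatusOK)
}
```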
Hi @ppalucki, thanks for your effort on this PR!
I think there should be a measurement after ..., and I think you can also support the measurement metrics in ...
//mux.Handle("/dataprep", otelhttp.NewHandler(http.HandlerFunc(mcDataHandler), "mcDataHandler"))

// Export metric using Prometheus client built-in handler.
mux.Handle("/metrics", promhttp.Handler())
Hi @ppalucki, is it possible to register this /metrics route via the handleFunc function as well?
@zhlsunshine fixed (added)
Yes, it is possible. Initially I didn't want to include /metrics in the stats, but you're right - it should be covered and then excluded later in visualization/post-processing.
I quickly added it; I need to test it more carefully. Here is example output:
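For illustration only, a hedged sketch of how the /metrics route could be wrapped so it is also instrumented; this is an assumption about the approach, not the actual change or output from this PR:

```go
package handlers

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

// registerMetricsRoute wraps the Prometheus exposition handler with otelhttp,
// so requests to /metrics show up in the HTTP server metrics and can be
// filtered out later during visualization/post-processing.
func registerMetricsRoute(mux *http.ServeMux) {
	mux.Handle("/metrics", otelhttp.NewHandler(promhttp.Handler(), "metrics"))
}
```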
Force-pushed from 735d3be to d1f3b04
- Small issues and also collect stats from /metrics router
- remove examples + add allTokenLatency metric
- Fix description of histograms
- pipeline latency histogram before token measurements
Force-pushed from d1f3b04 to 5548dfd
Hi @ppalucki, I think you may need to fix the DCO error by adding the Signed-off-by line to your commits.
provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter))
otel.SetMeterProvider(provider)

// ppalucki: Own metrics defintion bellow
Typos:
// ppalucki: Own metrics defintion bellow
Suggested change:
// ppalucki: Own metrics definition below
nice catch!
var (
	firstTokenLatencyMeasure metric.Float64Histogram
	nextTokenLatencyMeasure  metric.Float64Histogram
	allTokenLatencyMeasure   metric.Float64Histogram
	pipelineLatencyMeasure   metric.Float64Histogram
)
IMHO this would look nicer as:

type latencyMeasureT struct {
	firstToken, nextToken, allTokens, pipeline metric.Float64Histogram
}

var latencyMeasure latencyMeasureT
firstTokenLatencyMeasure, err = meter.Float64Histogram(
	"llm.first.token.latency",
	metric.WithUnit("ms"),
	metric.WithDescription("Measures the duration of first token generation."),
	api.WithExplicitBucketBoundaries(1, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16364),
)
if err != nil {
	log.Error(err, "metrics: cannot register first token histogram measure")
}
nextTokenLatencyMeasure, err = meter.Float64Histogram(
	"llm.next.token.latency",
	metric.WithUnit("ms"),
	metric.WithDescription("Measures the duration of generating all but first tokens."),
	api.WithExplicitBucketBoundaries(1, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16364),
)
if err != nil {
	log.Error(err, "metrics: cannot register next token histogram measure")
}

allTokenLatencyMeasure, err = meter.Float64Histogram(
	"llm.all.token.latency",
	metric.WithUnit("ms"),
	metric.WithDescription("Measures the duration to generate response with all tokens."),
	api.WithExplicitBucketBoundaries(1, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16364),
)
if err != nil {
	log.Error(err, "metrics: cannot register all token histogram measure")
}

pipelineLatencyMeasure, err = meter.Float64Histogram(
	"llm.pipeline.latency",
	metric.WithUnit("ms"),
	metric.WithDescription("Measures the duration to going through pipeline steps until first token is being generated (including read data time from client)."),
	api.WithExplicitBucketBoundaries(1, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16364),
)
if err != nil {
	log.Error(err, "metrics: cannot register pipeline histogram measure")
}
There's a bit of duplication. Maybe this could iterate over a slice like (untested code):
metrics := []struct {
	metric *metric.Float64Histogram
	name   string
	desc   string
}{
	{&latencyMeasure.firstToken, "llm.first.token.latency", "..."},
	// ...
}
for _, item := range metrics {
	*item.metric, err = meter.Float64Histogram(
		item.name,
		metric.WithUnit("ms"),
		metric.WithDescription(item.desc),
		api.WithExplicitBucketBoundaries(1, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16364),
	)
	if err != nil {
		log.Error(err, "metrics: cannot register '"+item.name+"' histogram measure")
	}
}
nice catch!
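As a usage illustration, a hedged sketch of how these histograms could be fed for one streamed response; the function shape, variable names, and the averaging of next-token latency are assumptions, not code from this PR:

```go
package metrics

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/metric"
)

// Histogram instruments, assumed to be registered as in the diff above.
var (
	firstTokenLatencyMeasure metric.Float64Histogram
	nextTokenLatencyMeasure  metric.Float64Histogram
	allTokenLatencyMeasure   metric.Float64Histogram
)

// recordTokenLatencies reports first-token, average next-token, and
// whole-response latencies in milliseconds for a single streamed response.
func recordTokenLatencies(ctx context.Context, start, firstTokenAt, lastTokenAt time.Time, tokens int) {
	firstTokenLatencyMeasure.Record(ctx, float64(firstTokenAt.Sub(start).Milliseconds()))
	allTokenLatencyMeasure.Record(ctx, float64(lastTokenAt.Sub(start).Milliseconds()))
	if tokens > 1 {
		// Average per-token latency for every token after the first one.
		nextTokenLatencyMeasure.Record(ctx,
			float64(lastTokenAt.Sub(firstTokenAt).Milliseconds())/float64(tokens-1))
	}
}
```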
See also: opea-project/GenAIComps#845
Description
Integrate OpenTelemetry (metrics only) and expose it using the Prometheus exporter for:
An additional feature is integrated OpenTelemetry trace context propagation (not tested).
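For context, a hedged sketch of a typical W3C trace context propagation setup with the OpenTelemetry Go SDK; the actual wiring in this PR may differ (and the PR notes it is untested):

```go
package metrics

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

// setupPropagation installs a global propagator so otelhttp handlers and
// transports forward W3C traceparent and baggage headers between services.
func setupPropagation() {
	otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(
		propagation.TraceContext{},
		propagation.Baggage{},
	))
}
```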
Metrics are exposed on /metrics with the Prometheus handler. Example output for metrics:
1) server metrics example
2) client metrics example
3) custom metrics example
4) Prometheus client default Go language metrics:
Issues
n/a
Type of change
Dependencies
go.opentelemetry.io/otel
go.opentelemetry.io/otel/exporters/prometheus
go.opentelemetry.io/otel/metric
go.opentelemetry.io/otel/sdk/metric
github.com/prometheus/client_golang/prometheus/promhttp
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
Tests
Manually tested using curl (examples shown above) after running with the ChatQnA sample graph only.
TODO: